Cloud activity threat detection for sparse and limited user behavior data

ABSTRACT

A cloud security system and method implements cloud activity threat detection using analysis of cloud usage user behavior. In particular, the cloud security system and method implements threat detection for users, cloud service providers, or tenants (enterprises) of the cloud security system who are new or unknown to the cloud security system and therefore lacking sufficient cloud activity data to generate an accurate behavior model for effective threat detection. In accordance with embodiments of the present invention, the cloud security system and method performs user behavior analysis to generate generalized user behavior models for user groups, where each user group includes users with similar cloud usage behavior. The user behavior models of the user groups are assigned to users with sparse cloud activity data. In this manner, the cloud security system and method of the present invention ensures effective threat detection by using accurate and reliable user behavior models.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 16/410,306, filed May 13, 2019, which is a continuation of U.S. patent application Ser. No. 15/260,189, filed Sep. 8, 2016, each of which is hereby incorporated by reference herein in its entirety.

BACKGROUND OF THE INVENTION

Cloud computing is the use of computing resources, including hardware and software, that are delivered as a service over a network, typically the Internet. As cloud computing achieves increased popularity and adoption of cloud-based services by businesses increases, concerns over security and risks of using these cloud-based services become significant. Traditionally, systems and software applications were deployed in enterprise environments, such as within an enterprise's own private data network, with strict controls and policies to ensure that data and usage are compliant with the enterprise's standards. However, the adoption of cloud-based services offered by third parties creates a potential mismatch, or complete absence, of expected enterprise level controls. Enterprises are faced with the challenge of assessing risk exposure associated with the use of cloud-based services in order to apply compensating controls.

In particular, an enterprise may adopt the use of one or more cloud-based services to support its business operation. The enterprise may wish to monitor the cloud usage activities to detect for potential or actual risks or threats to the enterprise's operation, such as data breach or compromised access or other security issues.

BRIEF DESCRIPTION OF THE DRAWINGS

Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.

FIG. 1 is a diagram illustrating an environment in which a cloud security system of the present invention can be deployed in some embodiments.

FIG. 2 is a flowchart illustrating a cloud security method using cloud usage user behavior analysis according to embodiments of the present invention.

FIG. 3 illustrates data tables in the database 90A for storing the cloud usage activity data in embodiments of the present invention.

FIG. 4 is a flowchart illustrating a method to generate user behavior models in the cloud security method of FIG. 2 in embodiments of the present invention.

FIG. 5 is a flowchart illustrating a method to detect anomalies in the cloud security method of FIG. 2 in embodiments of the present invention.

FIG. 6 illustrates an implementation of the cloud security method and illustrates an example of the sequence of user behavior models being applied to users of the cloud security system in operation.

DETAILED DESCRIPTION

The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.

A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.

According to embodiments of the present invention, a cloud security system and method implements cloud activity threat detection using analysis of cloud usage user behavior for users with sparse, limited or no cloud activity data. In particular, the cloud security system and method implements threat detection for users, cloud service providers, or tenants (enterprises) of the cloud security system who are new or unknown to the cloud security system and therefore lacking sufficient cloud activity data to generate an accurate behavior model for effective threat detection. In accordance with embodiments of the present invention, the cloud security system and method performs user behavior analysis to generate generalized user behavior models for user groups, where each user group includes users with similar cloud usage behavior. The user behavior models of the user groups are assigned to users with sparse cloud activity data, such as new users to the system, users using new cloud services, or users of new tenants (enterprises) to the system. In this manner, the cloud security system and method of the present invention ensures effective threat detection by using accurate and reliable user behavior models.

More specifically, the cloud security system and method of the present invention identifies security incidents or anomalies in the cloud activity of a particular user based on deviation from a typical user behavior. As such, an accurate user behavior model for each user is needed to ensure effective and reliable threat detection. The user behavior model describes what cloud activity behavior is considered normal for a user and what cloud activity behavior is considered not normal for the user. Cloud activity behavior that is not normal may be a threat to the enterprise or to the cloud service provider. For example, downloading 10 megabytes of data may be normal cloud activity behavior for user A but downloading 10 gigabytes of data may be not normal cloud activity behavior for the same user. In another example, downloading 10 megabytes of data may be normal cloud activity behavior for user A but downloading 10 megabytes of data may be not normal cloud activity behavior for user B. The user behavior model establishes the threshold against which abnormal cloud activity, indicative of threats or anomalies, of a particular user can be detected. The user behavior model for a user is derived from the cloud activity data of the user observed over time. When the user's behavior changes over time, the user's behavior model may be updated. But cloud usage activity that indicates sudden changes in the user's behavior will be flagged as abnormal user behavior.
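The per-user threshold idea above can be sketched as follows. This is a minimal illustration only: it assumes, hypothetically, that the behavior model is the mean and standard deviation of a user's past download sizes and that the threshold sits three standard deviations above the mean. Neither the statistic nor the factor is prescribed by this description, and the function names and sample histories are invented for the example.

```python
from statistics import mean, stdev


def build_threshold(history_mb, k=3.0):
    """Return an upper threshold from a user's past download sizes (MB).

    Assumption: mean + k * sample standard deviation stands in for the
    user behavior model; the description does not prescribe this statistic.
    """
    return mean(history_mb) + k * stdev(history_mb)


def is_abnormal(value_mb, history_mb, k=3.0):
    """Flag a download that exceeds the user's own threshold."""
    return value_mb > build_threshold(history_mb, k)


# Hypothetical histories: user A downloads about 10 MB, user B about 1 MB.
user_a = [9.0, 10.0, 11.0, 10.5, 9.5]
user_b = [0.9, 1.0, 1.1, 1.05, 0.95]

a_10mb = is_abnormal(10.0, user_a)      # typical size for user A
a_10gb = is_abnormal(10_000.0, user_a)  # far above user A's baseline
b_10mb = is_abnormal(10.0, user_b)      # same size, unusual for user B
```

The same 10 megabyte download is judged against two different baselines, which is why a separate model per user is needed.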

However, in cases where the cloud security system encounters a new user, or a user using a new cloud service provider not previously used, or a user from a new tenant to the cloud security system, there may often be insufficient cloud activity data to develop a usable user behavior model for the user for effective threat detection. That is, a reliable user behavior model is developed with a sufficient amount of historical cloud activity data for the user. For example, several weeks of a user's cloud activity data may be needed to establish a reliable user behavior model for the user. The more historical cloud activity data is available for the user, the more reliable the user behavior model is and the more confidence there is in it. However, there are cases where the cloud activity data for a user is too sparse to develop a reliable behavior model for security threat detection. In accordance with embodiments of the present invention, the cloud security system and method generates generalized user behavior models for user groups for users with common cloud user behavior. The cloud security system then applies these generalized user behavior models to users with sparse cloud activity data. That is, when a user's cloud activity data is too sparse for a reliable behavior model to be developed, the user is assigned to a user group based on matching of cloud activity attribute values and the generalized user behavior model for the user group is assigned to the user. In this manner, a generalized user behavior model is used to establish an initial threshold for cloud security threat detection for the user. The cloud security system can then detect potential or actual threats from cloud activities conducted by a user based on deviation from the generalized user behavior model selected for the user. The threat detection for the user can have greater accuracy even when the cloud activity data of the user is still sparse.
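The fallback described above can be sketched as follows, under stated assumptions. The sparsity cutoff `MIN_POINTS`, the attribute names, the use of Euclidean distance to group centroids, and the dictionary form of a model are all hypothetical choices made for illustration; the description says only that the user is matched to a group on cloud activity attribute values.

```python
import math

MIN_POINTS = 30  # assumed cutoff below which a user's data counts as sparse


def is_sparse(history, min_points=MIN_POINTS):
    """Treat a history with fewer than min_points samples as sparse."""
    return len(history) < min_points


def nearest_group(user_attrs, group_centroids):
    """Pick the group whose attribute centroid is closest to the user."""
    def dist(centroid):
        return math.sqrt(sum((user_attrs[k] - centroid[k]) ** 2 for k in user_attrs))
    return min(group_centroids, key=lambda name: dist(group_centroids[name]))


def select_model(history, own_model, user_attrs, group_centroids, group_models):
    """Use the user's own model when data suffices, else the matched group's."""
    if not is_sparse(history):
        return own_model
    return group_models[nearest_group(user_attrs, group_centroids)]


# Hypothetical groups and generalized models.
centroids = {
    "light": {"uploads_per_day": 2.0, "mb_per_day": 5.0},
    "heavy": {"uploads_per_day": 40.0, "mb_per_day": 900.0},
}
models = {"light": {"threshold_mb": 20.0}, "heavy": {"threshold_mb": 2000.0}}

new_user_attrs = {"uploads_per_day": 3.0, "mb_per_day": 8.0}
chosen = select_model([8.0, 7.5], None, new_user_attrs, centroids, models)
```

For simplicity the attributes are compared on their raw scales; a practical matcher would likely normalize them first so that no single attribute dominates the distance.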

The cloud security system and method of the present invention is particularly useful in threat detection operations when the cloud activity data for a user, for a cloud service provider, or for a tenant of the cloud security system is sparse and accurate user behavior models cannot be developed. In some embodiments, the cloud security system may be employed to monitor cloud activities for a number of users using a number of cloud service providers. The users may belong to different enterprises who are tenants of the cloud security system. In the present description, the cloud activity data for an entity is considered sparse when the cloud activity data is insufficient to develop an accurate user behavior model for the entity. The cloud activity data may be limited or completely lacking, that is, there may be no data at all for the entity using the particular cloud service.

When a new user is added to the cloud security system to be monitored, the cloud security system may have only limited or sparse cloud activity data for that new user. The limited or sparse cloud activity data is not sufficient to develop a reliable user behavior model for that user so that effective threat detection can be performed. Similarly, the cloud security system may be monitoring cloud activities from users to a number of cloud service providers. Users may start using a new cloud service provider for which the cloud security system does not have sufficient cloud activity data to generate a user behavior model for that cloud service provider. Alternately, a user, who has used other cloud based services, may start to use a new cloud service provider that the user has not used previously. Therefore, the cloud security system does not have sufficient cloud activity data to generate a user behavior model for the user using the new cloud service provider. In further examples, users may start using certain cloud service actions not previously known to the cloud security system. The cloud security system therefore does not have sufficient cloud activity data to model the user behavior with respect to the new service actions.

Lastly, the cloud security system may be monitoring cloud activities on behalf of its tenants who are typically enterprises. Users of the enterprises become users of the cloud security system. The cloud security system may introduce a new tenant including a group of new users. When a new tenant is added to the cloud security system, the cloud security system may have only limited or sparse cloud activity data for the users of the new tenant. The limited or sparse cloud activity data is not sufficient to develop reliable user behavior models for the users so that effective threat detection can be performed for users of the new tenant.

In these cases, the cloud security system and method of the present invention can be advantageously applied to generate generalized user behavior models for user groups of users sharing common cloud usage behavior. The generalized user behavior models can then be applied to users with sparse activity data due to any of the aforementioned scenarios. In this manner, the cloud security system and method is able to provide accurate threat detection right from the time the new user or the new cloud service provider or the new cloud service action or the new tenant is introduced to the cloud security system. Enhanced cloud security for users and tenants of the cloud security system is ensured. Specifically, the cloud security system and method improves the operations of computers and computer systems by providing protections against threats and unauthorized intrusion to a computer or a computer network.

In the present description, a “cloud-based service” or “cloud service” refers to computing resources, including hardware and software, that are delivered as a service over a data network. In most cases, the cloud service is deployed on the publicly available Internet. In embodiments of the present invention, the cloud-based service may also be deployed on a private data network of an enterprise, or be deployed on a third party private data network, or be deployed on a personal data network.

With the proliferation of cloud-based services, an enterprise may wish to adopt one or more cloud-based services for data storage or other applications. The users or employees of the enterprise may access the cloud-based services within or outside of the enterprise's own data network. In some cases, the enterprise may force network traffic between a client device and a cloud-based service to be re-directed through a network intermediary. However, in most cases, the users of the enterprise access the cloud-based services without going through the enterprise data network. Regardless of the access methods, the enterprise may wish to deploy security measures to monitor and control the use of the cloud-based services by the enterprise's employees and users. In embodiments of the present invention, the enterprise employs a cloud service security system which implements a method to perform threat detection based on user behavior analysis.

FIG. 1 is a diagram illustrating an environment in which a cloud security system of the present invention can be deployed in some embodiments. Referring to FIG. 1, one or more enterprises may adopt the use of one or more cloud-based services provided by cloud service providers (CSP) 30, such as CSP1, CSP2 and CSP3 shown in FIG. 1. The users or employees 10 of the enterprises may access the cloud-based services on behalf of the enterprises. Furthermore, the users of the enterprises may access the cloud-based services directly, without going through each enterprise's own data network. In some cases, some of the enterprises may deploy a network intermediary 20 and configure the cloud service providers to force network traffic to be redirected through the network intermediary 20.

To ensure security associated with the use of one or more cloud based services, the enterprises employ control and security measures to detect and contain potential or actual threats to the enterprise's data being communicated to and/or stored on the cloud service providers. In embodiments of the present invention, an enterprise may employ the cloud security system 50 of the present invention to implement cloud usage threat detection based on user behavior analysis. More specifically, the cloud security system 50 is advantageously applied to detect for security incidents, anomalies and threats even when there is insufficient cloud activity data to develop reliable threat detection models.

In embodiments of the present invention, cloud security system 50 is a multi-tenant system hosting cloud security monitoring and analysis for multiple tenants. The tenants, denoted as T-A, T-B, T-C and T-D, etc., in FIG. 1, are enterprises adopting the use of cloud-based services from cloud service providers 30 and wishing to deploy measures for cloud usage threat detection. Accordingly, the enterprises become tenants of the cloud security system 50 and employees or users of the enterprises are added to the cloud security system as users of the cloud security system. In the present description, references to “users of the cloud security system” refer to the users or employees of the enterprises who are tenants of the cloud security system. The users 10, denoted as User-A, User-B, User-C and User-D in FIG. 1, are associated with respective tenants and access one or more of the cloud service providers being monitored by the cloud security system 50.

Cloud security system 50 includes a cloud activity monitoring system 60 configured to receive and process user activity logs from one or more of the cloud service providers 30. In some embodiments, the cloud activity monitoring system 60 is a multi-tenant system, handling user activity logs from multiple cloud service providers on behalf of multiple tenants T-A, T-B, T-C and T-D. In particular, each cloud service provider 30 provides user activity logs containing network traffic data handled by the cloud service provider on behalf of the enterprises. In some examples, the cloud security system 50 obtains the user activity logs using an application program interface (API) of the cloud service provider. In some embodiments, the user activity logs contain information related to the cloud activity accessed by the users, such as the IP addresses of the originating device, the service actions being taken, and other related information such as the time of the access and the amount of data being accessed. The cloud activity monitoring system 60 processes the user activity logs to generate cloud usage activity information associated with the cloud service providers and the users. The cloud usage activity information is referred to herein as “processed cloud activity data” or “cloud activity data.” In some examples, the cloud activity monitoring system 60 processes the user activity logs by parsing the activity logs, identifying the cloud service provider, the tenant and the user, such as based on a cloud service provider ID, a tenant ID, and a user ID.
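As an illustration of the parsing step, the sketch below turns one raw log line into a processed cloud activity record. The JSON log format, the field names, and the `ActivityEvent` type are assumptions made for the example; actual cloud service providers expose their own log schemas through their own APIs.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityEvent:
    """One processed cloud activity record (hypothetical schema)."""
    csp_id: str
    tenant_id: str
    user_id: str
    service_action: str
    timestamp: str
    source_ip: str
    bytes_count: int


def parse_log_line(line):
    """Parse one JSON-formatted log line into an ActivityEvent."""
    raw = json.loads(line)
    return ActivityEvent(
        csp_id=raw["csp_id"],
        tenant_id=raw["tenant_id"],
        user_id=raw["user_id"],
        service_action=raw["action"],
        timestamp=raw["time"],
        source_ip=raw["ip"],
        bytes_count=int(raw.get("bytes", 0)),  # not every action moves data
    )


# Hypothetical log line for a download by User-A of tenant T-A at CSP1.
sample = (
    '{"csp_id": "CSP1", "tenant_id": "T-A", "user_id": "User-A", '
    '"action": "download", "time": "2016-09-08T10:00:00", '
    '"ip": "192.0.2.10", "bytes": 1048576}'
)
event = parse_log_line(sample)
```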

Cloud security system 50 includes a cloud activity threat detection system 80 configured to receive an event stream of cloud activity data from the cloud activity monitoring system 60 and to detect threats in the event stream of cloud usage activity for each user. More specifically, the threat detection system 80 detects, for each user, actions or activities at the cloud service providers that may indicate a security risk to the respective enterprise. In the present description, a threat in the cloud usage activity refers to security incidents and other potential or actual risks to the enterprise resulting from the use of the cloud-based services.

The threat detection system 80 may apply one or more heuristics to the cloud usage activity event stream to determine if certain user cloud usage activity should be deemed a threat or a security risk to the enterprise's data. In some embodiments, the threat detection system 80 may generate alerts or reports of the detected threats or anomalies.

In embodiments of the present invention, the threat detection system 80 is capable of identifying cloud usage activities that may be potential security threats. In particular, the threat detection system 80 identifies security incidents or anomalies in the cloud activity of a particular user based on deviation from a typical user behavior. In practice, a threat detection threshold is set for each user of the cloud security system 50 and the cloud activity threat detection system 80 identifies anomalies by comparing the cloud usage activity information of a user to the user's threat detection threshold. Alternately, one or more threat detection criteria are established for each user of the cloud security system 50 and the cloud activity threat detection system 80 identifies anomalies by checking the cloud usage activity information of a user against the user's threat detection criteria. The user's threat detection threshold or criteria is derived from the model coefficients describing the user behavior model of the user as applied to the prior cloud usage data of the user.
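One way to picture a threshold derived from model coefficients applied to prior usage data is a linear prediction of the next value, with an alarm when the observation strays too far from it. The sketch below assumes that form. The coefficient values, the fixed `tolerance`, and the function names are hypothetical and are not taken from this description.

```python
def predict_next(coeffs, prior):
    """Predict the next value as sum(coeffs[i] * prior[-1 - i]).

    coeffs[0] weights the most recent observation, coeffs[1] the one
    before it, and so on.
    """
    recent = prior[-len(coeffs):][::-1]  # most recent observation first
    return sum(c * x for c, x in zip(coeffs, recent))


def is_anomalous(observed, coeffs, prior, tolerance):
    """Flag an observation that deviates from the prediction by more than tolerance."""
    return abs(observed - predict_next(coeffs, prior)) > tolerance


# Hypothetical two-coefficient model and prior usage (MB per interval).
coeffs = [0.5, 0.5]
prior = [10.0, 12.0, 14.0]

expected = predict_next(coeffs, prior)              # 0.5 * 14 + 0.5 * 12
normal = is_anomalous(13.5, coeffs, prior, 5.0)     # close to the prediction
spike = is_anomalous(500.0, coeffs, prior, 5.0)     # far from the prediction
```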

In cloud security system 50, a threat detection model estimation system 70 is configured to receive the cloud usage activity data of the users and to generate user behavior models for each of the users. The threat detection model estimation system 70 generates user behavior models for users where the user behavior model is described by a set of model coefficients. With the user behavior models thus generated, the model coefficients of a user are applied to the prior cloud usage data of the user to detect threats or anomalies in the user's cloud activity event stream.

However, in some cases, the user's cloud activity data may be very sparse or non-existent. When the cloud activity data for a user is sparse, the threat detection model estimation system 70 cannot generate a user behavior model for the user with high confidence. That is, the user threat detection threshold or criterion generated from the user behavior model may not be accurate, so that the cloud activity threat detection operation may produce too many false positive or false negative results, neither of which is desirable. The sparse cloud activity data usually results from a new user, a new cloud service provider or a new tenant being introduced to the cloud security system 50.

For example, the cloud security system 50 may have been monitoring the cloud activities for user-A, user-C, and user-D belonging to one or more tenants. In the present example, user-A connects to cloud service provider CSP1 through the network intermediary 20. Meanwhile, user-C and user-D connect to the cloud service providers CSP1, CSP2 and CSP3 directly. A new employee may be added to a tenant. The new employee becomes a new user of the cloud security system 50. For example, a new user, such as user-B, may be added to the cloud security system 50. During an initial period when user-B is added, the cloud security system 50 may not have sufficient cloud activity data for user-B to generate a reliable user behavior model to monitor the cloud activity of user-B for threat detection.

In other examples, the cloud security system 50 may be configured to monitor a new cloud service provider, such as CSP3. The new cloud service provider CSP3 may be used by users of the cloud security system, such as user-D. During an initial period when cloud service provider CSP3 is introduced, the cloud security system 50 may not have sufficient cloud activity data for the new cloud service provider CSP3 to generate a reliable user behavior model to monitor the cloud activity of users using this new cloud service provider.

In another example, a user, such as user-C, who is already a member of the cloud security system 50, may be using the cloud service provider CSP1 and then start to use a new cloud service provider, such as CSP2. Although the cloud service provider CSP2 may be used by other users, such as user-D, of the cloud security system 50, the cloud security system 50 may not have sufficient cloud activity data for user-C using the new cloud service provider CSP2 during an initial period to generate a reliable user behavior model to monitor the cloud activity of user-C using this new cloud service provider.

In some examples, a user may start using a new service action at a cloud service provider. Service actions may include activities such as login, logout, upload data, download data, preview emails, update data field, and others. A user may have used some of the service actions at the cloud service provider but not others. When the user starts to use a new service action, the cloud security system 50 may not have sufficient cloud activity data for the user using the new cloud service action to generate a reliable user behavior model to monitor the cloud activity of the user.

In yet another example, the cloud security system 50 may be engaged to monitor users of several enterprises, such as tenants T-A, T-B, and T-C. Subsequently, the cloud security system 50 may be engaged to monitor a new tenant T-D. Users or employees of tenant T-D are added to the cloud security system 50 as users of the cloud security system. For the users of new tenant T-D, the cloud security system 50 does not have sufficient cloud activity data for the users to generate reliable user behavior models to monitor the cloud activity of the users.

According to embodiments of the present invention, the threat detection model estimation system 70 performs cloud usage user behavior analysis to generate generalized user behavior models for user groups including users with similar user behavior. With the user behavior models generated for a number of user groups, the threat detection model estimation system 70 may then match a user with sparse or limited cloud activity data for any reason to a user group based on common cloud activity attributes. In this manner, the threat detection model estimation system provides reliable user behavior models to the cloud activity threat detection system 80 to generate reliable threat detection thresholds or criteria to identify and detect for cloud activity anomalies for users with sparse cloud activity data. The threat detection model estimation system 70 is configured to receive the cloud usage activity information from the cloud activity monitoring system 60 and to generate user behavior models for users of the cloud security system. Each user behavior model is described by a set of model coefficients. In one example, a user behavior model is generated to describe the user's behavior for a user for performing a certain service action at a certain cloud service provider. In embodiments of the present invention, the threat detection model estimation system 70 operates in communication with a database 90 storing model coefficients describing the user behavior models and also storing processed cloud activity data. The database 90 can be implemented using a single database server or multiple database servers. The exact configuration of database 90 is not critical to the practice of the present invention.

The threat detection model estimation system 70 generates the user behavior models with model coefficients stored in the database 90. The cloud activity threat detection system 80 retrieves the model coefficients and the prior cloud usage data for a user from the database 90 to generate the threat detection threshold or criteria for a given user. In the event the cloud activity data for a user is sparse or non-existent, the threat detection model estimation system 70 assigns a user behavior model for a user group to the user by matching the user to a user group with common cloud activity attributes. In this manner, regardless of the status of the user, the cloud activity threat detection system 80 can analyze the cloud activity data for the users relative to the user threat detection thresholds to identify potential threats and anomalies in the user activity. The threat detection system 80 may then generate alerts or reports of the detected threats or anomalies.

In embodiments of the present invention, the cloud security system 50 can be implemented in hardware or in software. For example, the cloud security system 50 may be implemented in a hardware processor programmed with instructions to perform the method described above to monitor the cloud usage activity and to perform threat detection. Furthermore, the hardware processor may be programmed with instructions to generate threat detection models based on user behavior analysis, as described above.

FIG. 2 is a flowchart illustrating a cloud security method using cloud usage user behavior analysis according to embodiments of the present invention. The cloud security method of FIG. 2 can be implemented in the cloud security system 50 of FIG. 1 in embodiments of the present invention. In some embodiments, the cloud security method 100 is implemented as software executed on a hardware processor or a microprocessor in a computing device. Referring to FIG. 2, the cloud security method 100 receives cloud usage activity data in the user activity logs provided by one or more of the cloud service providers (102). The user activity logs contain information on activities of users accessing the cloud-based services on behalf of one or more tenants. The cloud usage activity data may be provided on a periodic basis, such as hourly.

The cloud security method 100 then aggregates and stores the cloud usage activity data (104). In one example, the method 100 aggregates the cloud usage activity data for each tenant-CSP combination. Furthermore, the method 100 may aggregate the cloud usage activity data over various time intervals. For example, the method 100 may receive cloud usage activity data on an hourly basis and aggregate the cloud usage activity data into 4-hour blocks, on a daily basis or on a work-week basis. The method 100 stores the cloud usage activity data, before and after aggregation, into a database 90A as prior cloud usage data. Database 90A can be part of the database 90 in FIG. 1. The cloud usage activity data is stored as a time series of data for each user, as described in more detail below with reference to FIG. 3.
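The aggregation step can be sketched as follows, assuming hourly records arrive as simple tuples with ISO 8601 timestamps and that byte counts are summed per block. The record layout and the function name are hypothetical. The block length should divide 24 so that blocks align with day boundaries.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_blocks(hourly_records, block_hours=4):
    """Sum hourly byte counts into fixed blocks per (tenant, CSP) pair.

    hourly_records: iterable of (tenant_id, csp_id, iso_timestamp, nbytes).
    Returns a dict keyed by (tenant_id, csp_id, block_start_iso).
    """
    totals = defaultdict(int)
    for tenant_id, csp_id, ts, nbytes in hourly_records:
        t = datetime.fromisoformat(ts)
        # Round the timestamp down to the start of its block.
        block_start = t.replace(
            hour=(t.hour // block_hours) * block_hours,
            minute=0, second=0, microsecond=0,
        )
        totals[(tenant_id, csp_id, block_start.isoformat())] += nbytes
    return dict(totals)


# Hypothetical hourly data for two tenants using the same provider.
records = [
    ("T-A", "CSP1", "2016-09-08T00:00:00", 100),
    ("T-A", "CSP1", "2016-09-08T01:00:00", 200),
    ("T-A", "CSP1", "2016-09-08T03:00:00", 50),
    ("T-A", "CSP1", "2016-09-08T04:00:00", 400),
    ("T-B", "CSP1", "2016-09-08T01:00:00", 7),
]
blocks = aggregate_blocks(records)
```

Daily aggregation follows from the same function with `block_hours=24`; a work-week aggregation would need a different rounding rule.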

FIG. 3 illustrates data tables in the database 90A for storing the cloud usage activity data in embodiments of the present invention. Referring to FIG. 3, the cloud usage activity data received by the cloud security system and method of the present invention may be configured in data tables grouped by tenants. The tenants are identified by tenant identifier TID. For each tenant, the database stores data tables for each cloud service provider, identified by cloud service provider identifier CSPID. For each tenant-cloud service provider combination, the database also stores a set of data tables, one data table for each service action for that cloud service provider. The service action is identified by the identifier SA. Thus, for each tenant, the database stores data tables for each cloud service provider and service action combination. For example, for tenant TA and cloud service provider Box, there is a data table Box.login for the service action “login” and a data table Box.upload for the service action “upload”.

Accordingly, for each tenant, a set of data tables 160 is provided where each data table 150 is associated with a cloud service provider and service action (CSP:SA) combination. Each data table 150 stores the time-series data for each user for the respective CSP:SA combination. In particular, the users are identified by the tenant identifier and a user identifier as TID:UID. Each data table stores the time-series data for all of the users of the tenant for that CSP:SA combination. In particular, the data table stores the time-series data for a set of cloud activity attributes (Att) relating to the service action. For example, the cloud activity attributes can include the number of bytes in the service action (e.g. upload or download), the number of times the CSP has been visited, and the number of reports accessed. Each attribute of the service action is collected as a time-series of data. The time-series of data may be aggregated over different levels of temporal granularity.
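The table layout described above can be pictured as nested mappings. The sketch below is an in-memory stand-in for the database tables, with hypothetical function names and sample values; it follows the Box.login and Box.upload naming from the example for tenant TA and keys each user as TID:UID.

```python
from collections import defaultdict


def make_store():
    """Nested layout: tenant -> CSP.SA table -> TID:UID -> attribute -> series."""
    return defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    )


def record(store, tid, cspid, sa, uid, attribute, timestamp, value):
    """Append one time-series sample to the matching table, user and attribute."""
    table = f"{cspid}.{sa}"    # e.g. "Box.login"
    user_key = f"{tid}:{uid}"  # users are identified as TID:UID
    store[tid][table][user_key][attribute].append((timestamp, value))


# Hypothetical samples for tenant TA at cloud service provider Box.
store = make_store()
record(store, "TA", "Box", "upload", "U1", "bytes", "2016-09-08T00:00:00", 1024)
record(store, "TA", "Box", "upload", "U1", "bytes", "2016-09-08T01:00:00", 2048)
record(store, "TA", "Box", "login", "U1", "count", "2016-09-08T00:00:00", 1)

upload_series = store["TA"]["Box.upload"]["TA:U1"]["bytes"]
tables_for_ta = sorted(store["TA"])
```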

In embodiments of the present invention, a data table for a CSP:SA combination further stores time-series data for a set of attributes belonging to an associated service action related to the service action of the data table. An associated service action is one which has some correlation with the service action being measured. For example, the service action being tracked may be “data download” and the associated service action may be “previews of emails”. If there is a large amount of data download, then the number of previews in that same time period should be fewer. The data recorded for the service action being measured and the associated service action provides insight into the cloud usage activity data. In particular, the user behavior model can then evaluate the service action being measured in view of the associated service action to provide a more accurate behavior model of the user's cloud usage activity.

As thus configured, for each tenant, a set of data tables 160 is provided to store users' cloud usage activity data for each cloud service provider and service action combination. With the cloud usage activity data for each user thus stored and aggregated, the method 100 then analyzes the aggregated cloud usage activity data for each user to generate user behavior models described by model coefficients (106). The cloud usage activity data for the user is collected over time and aggregated. The time-metered data for each user is then processed to generate a user behavior model for each user. In some embodiments, the cloud security method analyzes the cloud usage data using auto-regressive modeling. The auto-regressive modeling may include a first step of auto-regressive parameter estimation where the time-metered data is fitted as an auto-regressive process of a given order. In this operation, the cloud usage activity data is compressed to generate a set of time-series coefficients or AR coefficients. Then, the auto-regressive modeling may include a second step of performing Linear Predictive Coding. The AR coefficients estimated previously are used to generate LPC coefficients of the same order.
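
By way of example only, the auto-regressive parameter estimation may be sketched as follows. The sketch assumes the Levinson-Durbin recursion applied to the sample autocorrelation, which is one common way to obtain linear prediction coefficients; the description does not specify the estimator, and the function names and sample series are hypothetical.

```python
def autocorrelation(series, max_lag):
    """Biased sample autocorrelation r[0..max_lag] of the mean-removed series."""
    n = len(series)
    mean = sum(series) / n
    d = [v - mean for v in series]
    return [
        sum(d[t] * d[t + k] for t in range(n - k)) / n
        for k in range(max_lag + 1)
    ]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations by the Levinson-Durbin recursion.

    Returns (a, err) where x[t] is predicted as sum(a[k] * x[t-1-k]) and
    err is the remaining prediction-error variance.
    """
    a = []
    err = r[0]
    for m in range(order):
        acc = r[m + 1] - sum(a[j] * r[m - j] for j in range(m))
        k = acc / err if err > 0 else 0.0  # reflection coefficient
        a = [a[j] - k * a[m - 1 - j] for j in range(m)] + [k]
        err *= 1.0 - k * k
    return a, err

def fit_ar(series, order):
    """Compress a time series into `order` prediction coefficients."""
    coeffs, _ = levinson_durbin(autocorrelation(series, order), order)
    return coeffs

# Example: daily upload counts (made-up values) compressed to 2 coefficients.
coeffs = fit_ar([1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4], 2)
```

The returned list plays the role of the model coefficients in this sketch; the order (here 2) is an arbitrary choice for illustration.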

As a result, the method 100 generates for each user a set of coefficients as the user behavior model. For example, the user behavior model may include a set of 5-10 coefficients describing the user behavior for that service action at the cloud service provider. The method 100 then analyzes the user behavior models thus generated to form user groups with similar behavior (108). More specifically, the method derives a generalized user behavior model for a user group where the users of the user group share common cloud usage activity behavior as indicated by the coefficients. The generalized user behavior model may be derived from data from a large number of users. In some embodiments, the cloud security method aggregates data from the users of the multiple tenants of the cloud security system to generate the generalized user behavior model. In other embodiments, the cloud security method aggregates data from the users of a specific tenant of the cloud security system to generate the generalized user behavior model. In the present description, a generalized user behavior model derived from data of users from multiple tenants is referred to as a generic user behavior model, while a generalized user behavior model derived from data of users from a single tenant is referred to as a tenant level user behavior model. By aggregating the cloud usage activity data for a number of users in the user group, the method 100 is able to derive generalized user behavior models with high confidence values.

With the generalized user behavior models thus generated, the cloud security method 100 can then apply the generalized user behavior models for users with sparse cloud usage activity data (110). In some cases, the method 100 determines that the user is a new entity with sparse cloud usage activity data. More specifically, the method 100 may determine that the user is a new user, or the user starts using a new cloud service provider, or the user starts using a new service action from a new or previously used cloud service provider. Alternately, the method 100 may determine that the user belongs to a new tenant. Accordingly, when the method 100 detects a user with sparse cloud activity data, the method 100 assigns the user to a user group and assigns the generalized user group behavior model to the user with sparse data. In some embodiments, a set of model coefficients is computed for the user with sparse data and the method 100 assigns the user to a user group where the coefficients of the user group are the least distance from the computed coefficients. The coefficients of the user group become the coefficients for the user.
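
As an illustrative sketch, the least-distance assignment described above may be expressed as follows. Euclidean distance is assumed here because the description does not name a distance measure; the function name `assign_user_group`, the group labels and the coefficient values are hypothetical.

```python
import math

def assign_user_group(user_coeffs, group_models):
    """Return (group_id, coefficients) of the group nearest to user_coeffs.

    group_models maps a group identifier to that group's generalized
    model coefficients. Euclidean distance is an assumption of this sketch.
    """
    best = min(
        group_models,
        key=lambda g: math.dist(user_coeffs, group_models[g]),
    )
    # The coefficients of the group become the coefficients for the user.
    return best, list(group_models[best])

# Example with made-up two-coefficient models.
groups = {"A": [0.5, 0.1], "B": [-0.3, 0.4]}
group_id, assigned = assign_user_group([0.4, 0.2], groups)
```

In this example, the user with estimated coefficients [0.4, 0.2] is nearer to group A and therefore takes on the coefficients of group A.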

The cloud security method 100 then stores the model coefficients for the users (112). In some embodiments, the model coefficients are stored in a database 90B which can be part of the database 90 in FIG. 1.

The cloud security method 100 analyzes threats from the cloud activities for the users in the cloud security system. The cloud security method 100 uses the user behavior model established for each user to establish the threat detection threshold or criterion for the user. To that end, for each user, the method 100 retrieves the model coefficients associated with the user behavior model of the user and further retrieves the prior cloud usage data from the databases 90A and 90B. The method 100 generates a threat detection threshold or criterion for the user using the model coefficients and the prior cloud usage data (114). For example, to establish the detection threshold for Wednesday for a user, the model coefficients together with the cloud activity data for Monday and Tuesday are used. In the present description, the prior cloud usage data refers to a time period before the current time period for which threat is being detected.
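
By way of a hedged example, the Wednesday threshold described above may be sketched as a one-step prediction from the model coefficients and the Monday and Tuesday data, plus a tolerance. The additive `margin`, the function names and the numeric values are assumptions of this sketch; the description does not specify how the threshold is computed from the prediction.

```python
def predict_next(coeffs, prior):
    """One-step auto-regressive prediction.

    coeffs[0] weights the most recent sample in `prior`, coeffs[1] the
    sample before it, and so on.
    """
    recent = prior[::-1][:len(coeffs)]
    return sum(c * v for c, v in zip(coeffs, recent))

def detection_threshold(coeffs, prior, margin):
    """Upper bound for normal activity: prediction plus an assumed margin."""
    return predict_next(coeffs, prior) + margin

# Example: made-up coefficients, Monday = 100 and Tuesday = 120 uploads.
wednesday_threshold = detection_threshold([0.6, 0.3], [100, 120], margin=50)
```

Here the threshold for Wednesday depends only on the stored coefficients and the two prior days, so it can be recomputed for each new time period as prior data accumulates.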

With the threat detection threshold or criterion thus established for the user, the method 100 then applies the threat detection threshold to detect anomalies in the cloud usage activities of the user (116). The threat detection threshold or criterion identifies cloud usage activities of the user as normal activities or abnormal activities. Abnormal activities or detected anomalies in the cloud usage activities can be indications of security incidents or potential or actual threats from the cloud usage activities. The method 100 performs an action based on the detected anomalies (118). In some embodiments, the method 100 may rank the anomalies based on their risk levels. Some anomalies may be deemed low risk and some anomalies may be deemed high risk requiring immediate attention by the enterprise. The method 100 may filter the anomalies to remove anomalies that are deemed low risk or have a low risk rating. Finally, the method 100 may report the anomalies. In some cases, the method 100 may report the anomalies with ranking of the risk levels.
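
As a non-limiting sketch, the ranking, filtering and reporting actions may be expressed as follows. The numeric `risk` field, the `min_risk` cutoff and the record layout are hypothetical; the description does not define how risk levels are represented.

```python
def triage(anomalies, min_risk):
    """Drop anomalies below min_risk, then rank the rest by risk, highest first."""
    kept = [a for a in anomalies if a["risk"] >= min_risk]
    return sorted(kept, key=lambda a: a["risk"], reverse=True)

def report(ranked):
    """Render one report line per anomaly, in ranked order."""
    return [
        f"{i + 1}. user={a['user']} risk={a['risk']}"
        for i, a in enumerate(ranked)
    ]

# Example with made-up anomalies.
found = [
    {"user": "U1", "risk": 0.2},
    {"user": "U2", "risk": 0.9},
    {"user": "U3", "risk": 0.6},
]
lines = report(triage(found, min_risk=0.5))
```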

The user behavior models and model coefficients generated by the cloud security method 100 are periodically updated or refreshed based on the aggregated cloud activity data being received and stored for each user (120). In some embodiments, the model coefficients are updated at regular time intervals. In other embodiments, the method 100 may initiate a refresh operation to re-compute the model coefficients based on the aggregated cloud usage activity data.

A salient feature of the cloud security system and method of the present invention is that a user behavior model for a user can be generated where there is a lack of historical cloud usage activity data for that user or where there is sparse cloud usage activity data for that user. In this manner, the cloud security system and method can realize effective threat detection for all users and for all time periods. That is, even when the user is new to the system or when the use of the cloud service provider is new, the cloud security system can realize effective threat detection by establishing a threat detection threshold or criterion from a generalized user behavior model of a user group. The user behavior model is adaptive and can be updated continuously or at regular intervals as more cloud activity data is collected for the user. However, the cloud security system and method ensures a reliable user behavior model is available to use for the user in the initial time period when the user has no historical activity data or has only very sparse activity data.

In the above description, the user behavior model is used to derive or generate a threat detection threshold or a threat detection criterion for the user to detect threats or anomalies in the cloud usage activity of the user. The threat detection threshold can be a parameter level or a criterion identifying or describing the distinction between normal and abnormal cloud usage behaviors for the user. In the present description, the term “threat detection threshold” is used to describe the parameter levels or criteria used to demarcate normal versus abnormal cloud usage behavior. The “threat detection threshold” can be a parameter level or one or more criteria.

More specifically, in some embodiments, the threat detection threshold is a parameter level, such as the number of bytes of cloud data downloaded, or the number of files accessed at the cloud service provider, or the time and hour of accessing certain cloud data. In other embodiments, the threat detection threshold includes criteria that indicate risky data access. For example, the threat detection threshold can be criteria for abnormal user provisioning and/or the nature of the cloud data access activity. In one example, the sequence of user creation and user deletion within a period of M days can be used as the threat detection threshold. For example, a user who created and deleted another user within a short span of time can be indicative of risky data access. The risk level is higher when the newly created user has accessed or downloaded cloud data. Accordingly, the threat detection threshold does not need to be a parameter level or a numeric value. The threat detection threshold can be selected to protect against cloud data access and data loss risks as needed.
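
As an illustration of a non-numeric criterion, the user creation and deletion sequence within M days may be sketched as follows. The event tuple layout and the action names `create_user` and `delete_user` are hypothetical and are chosen only for this example.

```python
from datetime import datetime, timedelta

def risky_provisioning(events, max_days):
    """Flag (actor, target) pairs where the actor created and then deleted
    the same user within max_days.

    events: iterable of (timestamp, actor, action, target) tuples.
    """
    created = {}
    flagged = []
    for ts, actor, action, target in sorted(events):
        if action == "create_user":
            created[(actor, target)] = ts
        elif action == "delete_user" and (actor, target) in created:
            if ts - created[(actor, target)] <= timedelta(days=max_days):
                flagged.append((actor, target))
    return flagged

# Example with made-up events and M = 7 days.
events = [
    (datetime(2016, 9, 1), "admin", "create_user", "tmp1"),
    (datetime(2016, 9, 2), "admin", "delete_user", "tmp1"),
    (datetime(2016, 9, 1), "admin", "create_user", "bob"),
    (datetime(2016, 10, 15), "admin", "delete_user", "bob"),
]
flagged = risky_provisioning(events, max_days=7)
```

A fuller criterion could also raise the risk level when the short-lived user accessed or downloaded cloud data, as noted above; that refinement is omitted from this sketch.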

FIG. 4 is a flowchart illustrating a method to generate user behavior models in the cloud security method of FIG. 2 in embodiments of the present invention. The method 200 described in FIG. 4 can be implemented in the cloud security method 100 of FIG. 2. Referring to FIG. 4, the method 200 aggregates and stores the cloud usage activity data (202). In one example, the method 200 aggregates the cloud usage activity data for each tenant-CSP combination. Furthermore, the method 200 may aggregate the cloud usage activity data over various time intervals. For example, the method 200 may receive cloud usage activity data on an hourly basis and aggregate the cloud usage activity data into 4-hour blocks, or on a daily basis or a work-week basis. The method 200 may store the cloud usage activity data, before and after aggregation, into the database 90A as prior cloud usage data.

The method 200 then classifies and aggregates the cloud usage activity data into time groups (204). For example, the cloud usage activity data may be aggregated into 4-hour intervals and may be classified into a first category of office-hour activity data and a second category of non-office-hour activity data. The cloud usage activity data thus collected and aggregated is then processed to generate a user behavior model for each user (206). In some embodiments, the cloud security method analyzes the cloud usage data using auto-regressive modeling. The auto-regressive modeling may include a first step of auto-regressive parameter estimation where the time-metered data is fitted as an auto-regressive process of a given order. In this operation, the cloud usage activity data is compressed to generate a set of time-series coefficients or AR coefficients. Then, the auto-regressive modeling may include a second step of performing Linear Predictive Coding. The AR coefficients estimated previously are used to generate LPC coefficients of the same order.
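
For illustration only, the aggregation into 4-hour intervals and the classification into office-hour and non-office-hour categories may be sketched as follows. The office-hour window of 08:00 to 20:00 is an assumption chosen to align with the 4-hour blocks; the description does not define office hours, and the function names are hypothetical.

```python
def aggregate_blocks(hourly, block_hours=4):
    """Sum hourly values into blocks keyed by the block's starting hour.

    hourly: dict mapping hour of day (0-23) to a measured value.
    """
    blocks = {}
    for hour, value in hourly.items():
        start = (hour // block_hours) * block_hours
        blocks[start] = blocks.get(start, 0) + value
    return blocks

def classify(block_start, office_start=8, office_end=20):
    """Label a block as office-hour or non-office-hour (assumed window)."""
    if office_start <= block_start < office_end:
        return "office"
    return "non-office"

# Example: made-up hourly upload counts for one user on one day.
blocks = aggregate_blocks({9: 10, 10: 5, 13: 7, 23: 2})
by_category = {start: classify(start) for start in blocks}
```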

As a result, the method 200 generates for each user a set of coefficients as the user behavior model. For example, the user behavior model may include a set of 5-10 coefficients describing the user behavior for that service action at the cloud service provider. The method 200 then analyzes the user behavior models thus generated to form user groups with similar behavior. In the present embodiment, the method 200 identifies users with similar coefficient values for the CSP service action (208). For example, the method 200 may cluster users within a certain time-group. In another example, the method 200 may cluster users based on the magnitude of byte usage being one of the attributes being measured for the service action. The method 200 groups users with similar coefficient values into a user group and generates a generalized user behavior model for the user group (210).
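
As one possible sketch, users with similar coefficient values may be grouped by a simple greedy procedure, with the mean coefficient vector of each group serving as its generalized user behavior model. The description does not name a clustering algorithm; the greedy pass, the `radius` parameter, the Euclidean distance and the use of a mean are all assumptions of this example.

```python
import math

def group_users(user_coeffs, radius):
    """Greedy grouping of users by coefficient similarity.

    A user joins the first group whose centroid lies within `radius`;
    otherwise the user starts a new group. The centroid (running mean)
    stands in for the group's generalized model in this sketch.
    """
    groups = []
    for uid, coeffs in user_coeffs.items():
        for g in groups:
            if math.dist(coeffs, g["centroid"]) <= radius:
                g["members"].append(uid)
                n = len(g["members"])
                # Incremental update of the mean coefficient vector.
                g["centroid"] = [
                    c + (x - c) / n for c, x in zip(g["centroid"], coeffs)
                ]
                break
        else:
            groups.append({"members": [uid], "centroid": list(coeffs)})
    return groups

# Example with made-up two-coefficient user models.
users = {"u1": [0.5, 0.1], "u2": [0.52, 0.12], "u3": [-0.4, 0.3]}
user_groups = group_users(users, radius=0.1)
```

The result of a greedy pass depends on the order in which users are visited; a production system would likely use a more robust clustering method.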

With the generalized user behavior models thus generated, the cloud security method 200 can then apply the generalized user behavior models for users with sparse cloud usage activity data (212). For users with sparse cloud activity data, the method 200 identifies a user group with similar cloud activity attributes that can be assigned to the user. In some embodiments, the method 200 compares the attribute values of the service action being measured and assigns a user group based on similar attribute values to the user.

The method 200 assigns the user to a user group and assigns the generalized user group behavior model to the user with sparse data. Accordingly, the coefficients for the user group are assigned to the user (214). The cloud security method 200 then stores the model coefficients for the users (216). In some embodiments, the model coefficients are stored in a database 90B which can be part of the database 90 in FIG. 1.

FIG. 5 is a flowchart illustrating a method to detect anomalies in the cloud security method of FIG. 2 in embodiments of the present invention. The method 250 described in FIG. 5 can be implemented in the cloud security method 100 of FIG. 2. Referring to FIG. 5, the method 250 analyzes threats from the cloud activities for the users in the cloud security system. The method 250 first selects a candidate user (252). The method 250 receives the event stream of cloud usage activity data for the candidate user. For example, the event stream may be the cloud usage data in a most recent time interval stored in the database 65.

The method 250 uses the user behavior model established for each user to establish the threat detection threshold for the user. To that end, for the candidate user, the method 250 retrieves the model coefficients associated with the user behavior model of the user (254). The method 250 determines a threat detection threshold for the user using the model coefficients and the prior cloud usage data, received from the prior cloud usage data database 90A (256). The prior cloud usage data belongs to a time period prior to the cloud usage data in the event stream. For example, the event stream may be providing cloud usage activity for the current week. The prior cloud usage data may contain cloud usage activity data from the previous 5 weeks. The threat detection threshold is determined based on the model coefficients and the prior cloud usage data.

With the threat detection threshold thus established for the candidate user, the method 250 then applies the threat detection threshold to detect anomalies in the candidate user's event stream (258). The threat detection threshold identifies cloud usage activities of the user as normal activities or abnormal activities. Abnormal activities or detected anomalies in the cloud usage activities can be indications of security incidents or potential or actual threats from the cloud usage activities. The method 250 may evaluate the risk score of the detected threat (260). For example, the detected threat may be designated as a low risk anomaly or a high risk threat. The method 250 may store the detected threat and the associated risk score (262). Finally, the method 250 may perform an action based on the detected threat and the associated risk score (264). For example, the method 250 may generate an alert when the risk score for the detected threat exceeds a given level.
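
By way of example only, steps 258 through 264 may be sketched as follows for a numeric threshold. The risk score defined as the relative excess over the threshold, the `alert_level` parameter and the record layout are assumptions of this sketch; the description does not define how the risk score is computed.

```python
def detect(event_stream, threshold, alert_level):
    """Apply a positive numeric threshold to an event stream.

    event_stream: iterable of (period, value) pairs.
    Returns one finding per anomalous period with an assumed risk score
    and a flag telling whether an alert should be generated.
    """
    findings = []
    for period, value in event_stream:
        if value > threshold:
            score = (value - threshold) / threshold  # assumed scoring rule
            findings.append(
                {
                    "period": period,
                    "value": value,
                    "risk": score,
                    "alert": score > alert_level,
                }
            )
    return findings

# Example with made-up daily download counts and a threshold of 150.
stream = [("Mon", 100), ("Tue", 300), ("Wed", 160)]
findings = detect(stream, threshold=150, alert_level=0.5)
```

In this example, Tuesday and Wednesday both exceed the threshold, but only Tuesday's excess is large enough to generate an alert under the assumed scoring rule.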

In one example, the cloud security system and method has been monitoring the cloud usage activity of an enterprise. A new user Tom is added to the enterprise and to the cloud security system and method. The new user Tom starts to use the cloud service Facebook. In an initial period, there is insufficient data to generate a reliable user behavior model for the new user Tom using Facebook. During the initial period, the cloud security system and method determines that the new user's activity at Facebook is most similar to that of a user group A. The generalized user behavior model of user group A is then assigned to the new user Tom. As time progresses, more and more cloud activity data may be collected for the new user Tom. In that case, the new user Tom's user behavior model may be updated based on Tom's own cloud usage activity data or Tom may be assigned to another user group based on his cloud usage activity attributes. By generating or assigning a user behavior model to the new user Tom, the cloud security system and method can provide an effective threat detection threshold to detect anomalies in the cloud usage activity of the new user Tom, even during the period when Tom himself has insufficient cloud usage activity data.

FIG. 6 illustrates an implementation of the cloud security method and illustrates an example of the sequence of user behavior models being applied to users of the cloud security system in operation. Referring to FIG. 6, at time T0, a new tenant is added to the cloud security system. The new tenant can be an enterprise wishing to deploy the cloud security system to protect from threats due to cloud usage activities from its employees. For those users of the tenant with sufficient cloud usage activity seed data to support generating individual user behavior models, a user behavior model is generated for each user and applied to the user immediately for threat detection. In the present example, user A has sufficient seed data and a user behavior model is built for user A using user A's cloud usage activity seed data.

On the other hand, for those users with insufficient cloud usage activity seed data, the cloud security method begins to collect cloud usage data for the users. During this initial period, the cloud security method applies a generalized user behavior model to each of the users. The generalized user behavior models are generated using cloud usage activity data of all of the users of the cloud security system and across all the tenants. In the present example, the generalized user behavior models generated from multi-tenant users are referred to as generic user behavior models. The users with sparse or no cloud usage activity data are assigned to respective user groups. In one embodiment, estimated model coefficients are generated for the users with sparse or no cloud usage activity data and users are assigned to user groups having coefficients that are the least distance from the users' estimated model coefficients. The generic user behavior model of the assigned user group is then applied to the user. In the present example, user B has no or sparse cloud usage activity data and user B is assigned to a user group and a generic user behavior model is assigned to user B for performing threat detection.

After a given time period, cloud usage activity data has been collected for the users of the new tenant. At time T1, the cloud security method generates generalized user behavior models using cloud usage activity data of users of the new tenant only. In the present example, the generalized user behavior models generated from tenant-specific users are referred to as tenant-level user behavior models. Then, for users with no or sparse cloud usage activity data, the tenant-level user behavior models are applied to the users. In the present example, at time T1, user B was previously assigned to a generic user behavior model and is now assigned to tenant-level user behavior model for performing threat detection. The time T1 can be at the end of the first week after the new tenant is added to the cloud security system.

After another time period, sufficient cloud usage activity data has been collected for user B to generate a user behavior model using the data from user B only. In the present example, at time T4, a user-level static model is generated for user B using the data collected for user B. The time T4 can be at the end of the fourth week after the new tenant is added to the cloud security system. Finally, after another time period, at time T8, sufficient cloud usage activity data has been collected for user B to allow a user behavior model to be generated for user B. The time T8 can be at the end of the eighth week after the new tenant is added to the cloud security system.
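
As a non-limiting sketch, the progression of models applied to user B in this example may be expressed as a selection rule. The function name, the tier labels and the idea of driving the choice from a count of weeks of collected data are assumptions of this sketch; the week counts are passed in by the caller because the description presents them only as examples.

```python
def select_model_tier(weeks_of_user_data, tenant_models_ready,
                      static_weeks, full_weeks):
    """Pick which model to apply to a user for threat detection.

    static_weeks / full_weeks: weeks of the user's own data needed for the
    user-level static model and the user behavior model respectively
    (caller-supplied; illustrative only).
    """
    if weeks_of_user_data >= full_weeks:
        return "user-level"
    if weeks_of_user_data >= static_weeks:
        return "user-level static"
    if tenant_models_ready:
        return "tenant-level"
    return "generic"

# Example following user B's timeline (T0, T1, T4, T8) in this description.
timeline = [
    select_model_tier(0, False, static_weeks=4, full_weeks=8),
    select_model_tier(1, True, static_weeks=4, full_weeks=8),
    select_model_tier(4, True, static_weeks=4, full_weeks=8),
    select_model_tier(8, True, static_weeks=4, full_weeks=8),
]
```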

As a further example, a new user of the tenant may be introduced to the cloud security system at a time after the initial period T0. For example, the tenant may have added a new employee. The new user may have no or sparse cloud usage activity data. For example, user C may be added to the cloud security system around time T2 which can be the end of the second week. At time T2, the tenant-level user behavior model for the tenant has been built and user C is assigned to a tenant-level user behavior model. At time T4, at the end of the fourth week after the new tenant is added, a user-level static model is generated for user C using the data collected for user C. Then, after another time period, at time T9, sufficient cloud usage activity data has been collected for user C to allow a user behavior model to be generated for user C. The time T9 can be at the end of the ninth week after the new tenant is added to the cloud security system. When user C is added to the cloud security system a week later, the user level behavior model will be accordingly delayed by a week for user C as compared to user B to ensure sufficient cloud usage activity data is collected for user C.

Finally, in yet another example, the tenant may start using a new cloud service provider at a time after the initial period T0. There may not be any cloud usage activity data for users of the tenant using this new cloud service provider. For example, a CSP D may be added to the cloud security system around the second week, around time T2. At that time, the tenant-level user behavior models for the tenant have been built and the users are then assigned the tenant-level user behavior models for CSP D. In particular, tenant-level user behavior models are built for CSP D by using cloud usage activity data of other CSPs with similar attributes to CSP D. Each user of the tenant is assigned to a tenant-level user behavior model built for CSP D. At time T6, at the end of the sixth week after the new tenant is added, a user-level static model is generated for the users using CSP D using the data collected for the users. Then, after another time period, at time T10, sufficient cloud usage activity data has been collected for the users using CSP D to allow user behavior models to be generated for the users. The time T10 can be at the end of the tenth week after the new tenant is added to the cloud security system. When a new CSP is added to the cloud security system a week later, the user level behavior model will be accordingly delayed by one or more weeks to ensure sufficient cloud usage activity data is collected for using the new CSP. In the present example, eight weeks of cloud usage activity data is required to generate the user level static model for a user and twelve weeks of cloud usage activity data is required to generate the user behavior model for a user.

Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive. 

What is claimed is:
 1. A method of detecting anomalies in usage activities at one or more cloud-based service providers, the method comprising: determining that a user does not have sufficient usage activity data to generate a user behavior model for the user; assigning, using the hardware processor, the user to an assigned user group of a plurality of user groups, wherein each of the plurality of user groups represents similar user behavior of users in the user group and has a corresponding generalized user behavior model; assigning, using the hardware processor, the generalized user behavior model of the assigned user group as a user behavior model for the user; generating, using the hardware processor, a threat detection threshold for the user using coefficients of the user behavior model of the user; receiving, using the hardware processor, cloud usage activity data of the user; detecting, using the hardware processor, an anomaly in the cloud usage activity data of the user using the threat detection threshold of the user, the anomaly indicating a potential security risk associated with usage activities at the cloud-based service providers; and performing an action based on the detected anomaly.
 2. The method of claim 1, wherein performing an action based on the detected anomaly comprises ranking the detected anomaly among other detected anomalies based on risk levels.
 3. The method of claim 1, wherein performing an action based on the detected anomaly comprises filtering the detected anomaly and other detected anomalies to remove anomalies having a low-risk rating.
 4. The method of claim 1, wherein performing an action based on the detected anomaly comprises generating a report including the detected anomaly.
 5. The method of claim 1, wherein assigning, using the hardware processor, the user to the assigned user group of the user groups comprises: identifying the assigned user group as having similar cloud activity attribute values as the user, the cloud activity attribute values relating to cloud usage activity data being measured; and assigning the user to the assigned user group.
 6. The method of claim 5, wherein identifying the assigned user group as having similar cloud activity attribute values as the user comprises: identifying the assigned user group as having similar cloud activity attribute values as the user, the cloud activity attributes comprising a number of bytes of data in a service action, a number of times the one of the one or more cloud-based service providers has been visited, or a number of reports being accessed.
 7. A system for detecting anomalies in usage activities at one or more cloud-based service providers, comprising: a memory; and a hardware processor coupled to the memory and configured to: determine that a user does not have sufficient usage activity data to generate a user behavior model for the user; assign the user to an assigned user group of a plurality of user groups, wherein each of the plurality of user groups represents similar user behavior of users in the user group and has a corresponding generalized user behavior model; assign the generalized user behavior model of the assigned user group as a user behavior model for the user; generate a threat detection threshold for the user using coefficients of the user behavior model of the user; receive cloud usage activity data of the user; detect an anomaly in the cloud usage activity data of the user using the threat detection threshold of the user, the anomaly indicating a potential security risk associated with usage activities at the cloud-based service providers; and perform an action based on the detected anomaly.
 8. The system of claim 7, wherein the hardware processor, in performing an action based on the detected anomaly, ranks the detected anomaly among other detected anomalies based on risk levels.
 9. The system of claim 7, wherein the hardware processor, in performing an action based on the detected anomaly, filters the detected anomaly and other detected anomalies to remove anomalies having a low-risk rating.
 10. The system of claim 7, wherein the hardware processor, in performing an action based on the detected anomaly, generates a report including the detected anomaly.
 11. The system of claim 7, wherein the hardware processor, in assigning the user to the assigned user group of the user groups: identifies the assigned user group as having similar cloud activity attribute values as the user, the cloud activity attribute values relating to cloud usage activity data being measured; and assigns the user to the assigned user group.
 12. The system of claim 11, wherein the hardware processor, in identifying the assigned user group as having similar cloud activity attribute values as the user: identifies the assigned user group as having similar cloud activity attribute values as the user, the cloud activity attributes comprising a number of bytes of data in a service action, a number of times the one of the one or more cloud-based service providers has been visited, or a number of reports being accessed.
 13. A non-transitory computer-readable medium containing computer-executable instructions that, when executed by a processor, cause the processor to perform a method of detecting anomalies in usage activities at one or more cloud-based service providers, the method comprising: determining that a user does not have sufficient usage activity data to generate a user behavior model for the user; assigning the user to an assigned user group of a plurality of user groups, wherein each of the plurality of user groups represents similar user behavior of users in the user group and has a corresponding generalized user behavior model; assigning the generalized user behavior model of the assigned user group as a user behavior model for the user; generating a threat detection threshold for the user using coefficients of the user behavior model of the user; receiving cloud usage activity data of the user; detecting an anomaly in the cloud usage activity data of the user using the threat detection threshold of the user, the anomaly indicating a potential security risk associated with usage activities at the cloud-based service providers; and performing an action based on the detected anomaly.
 14. The non-transitory computer-readable medium of claim 13, wherein performing an action based on the detected anomaly comprises ranking the detected anomaly among other detected anomalies based on risk levels.
 15. The non-transitory computer-readable medium of claim 13, wherein performing an action based on the detected anomaly comprises filtering the detected anomaly and other detected anomalies to remove anomalies having a low-risk rating.
 16. The non-transitory computer-readable medium of claim 13, wherein performing an action based on the detected anomaly comprises generating a report including the detected anomaly.
 17. The non-transitory computer-readable medium of claim 13, wherein assigning the user to the assigned user group of the user groups comprises: identifying the assigned user group as having similar cloud activity attribute values as the user, the cloud activity attribute values relating to cloud usage activity data being measured; and assigning the user to the assigned user group.
 18. The non-transitory computer-readable medium of claim 17, wherein identifying the assigned user group as having similar cloud activity attribute values as the user comprises: identifying the assigned user group as having similar cloud activity attribute values as the user, the cloud activity attributes comprising a number of bytes of data in a service action, a number of times the one of the one or more cloud-based service providers has been visited, or a number of reports being accessed. 